Add Browser Env Integration #732

filip-michalsky · 2026-01-15T11:46:19Z

Description

Adds BrowserEnv, a unified browser automation integration for the verifiers library that supports two operational modes:

DOM Mode (mode="dom")

Uses the Stagehand Python SDK for natural language browser control
Tools: navigate, observe, act, extract - Stagehand's AI-driven primitives
Ideal for tasks that benefit from semantic understanding of page elements

CUA Mode (mode="cua")

Vision-based primitives for Computer Use Agent workflows
Tools: click, double_click, type_text, keypress, scroll, goto, back, forward, wait, screenshot
Requires the companion TypeScript server (included), which handles the CDP connection via Stagehand internals
Automatic screenshot management with context trimming for VLM input
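
To illustrate what context trimming for VLM input can look like, here is a minimal sketch. It is not the code from this PR: the function name trim_screenshots, the keep_last parameter, and the placeholder text are hypothetical, and it assumes OpenAI-style chat messages whose content is either a string or a list of parts with type "image_url".

```python
from copy import deepcopy
from typing import Any

Message = dict[str, Any]


def trim_screenshots(messages: list[Message], keep_last: int = 2) -> list[Message]:
    """Return a copy of `messages` in which only the newest `keep_last` images survive.

    Hypothetical helper: older image parts are swapped for a short text
    placeholder so the turn structure of the conversation is preserved.
    """
    trimmed = deepcopy(messages)  # never mutate the caller's history
    remaining = keep_last
    # Walk newest-to-oldest so the most recent screenshots are the ones kept.
    for message in reversed(trimmed):
        content = message.get("content")
        if not isinstance(content, list):
            continue  # plain-string content carries no images
        for index in range(len(content) - 1, -1, -1):
            part = content[index]
            if isinstance(part, dict) and part.get("type") == "image_url":
                if remaining > 0:
                    remaining -= 1
                else:
                    content[index] = {"type": "text", "text": "[screenshot removed]"}
    return trimmed
```

Replacing old images with a placeholder rather than deleting whole messages keeps tool-call and tool-result turns paired, which most chat APIs require.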

Both modes support local browser execution or Browserbase cloud infrastructure.
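
For reviewers who want the two tool sets side by side, the mapping below restates them as data. The tool names are copied from this description; the TOOLS_BY_MODE table and the tools_for_mode helper are hypothetical and are not part of the BrowserEnv API.

```python
# Tool names come from the PR description; the lookup helper is illustrative only.
TOOLS_BY_MODE: dict[str, tuple[str, ...]] = {
    "dom": ("navigate", "observe", "act", "extract"),
    "cua": (
        "click", "double_click", "type_text", "keypress", "scroll",
        "goto", "back", "forward", "wait", "screenshot",
    ),
}


def tools_for_mode(mode: str) -> tuple[str, ...]:
    """Return the tool names for `mode`, rejecting anything but "dom" or "cua"."""
    try:
        return TOOLS_BY_MODE[mode]
    except KeyError:
        valid = ", ".join(sorted(TOOLS_BY_MODE))
        raise ValueError(f"unknown mode {mode!r}; expected one of: {valid}") from None
```

Failing fast on an unrecognized mode string surfaces a typo such as mode="DOM" at construction time instead of during a rollout.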

What's included:

verifiers/envs/integrations/browser_env/ - Core integration (BrowserEnv, DOMMode, CUAMode)
verifiers/envs/integrations/browser_env/cua-server/ - TypeScript server for CUA mode
environments/browser_dom_example/ - Minimal DOM mode example
environments/browser_cua_example/ - Minimal CUA mode example
New [browser] extra: uv add 'verifiers[browser]'

Benchmarks (GAIA, WebVoyager, Mind2Web) have been pushed to Prime Hub under the browserbase/ namespace.

Type of Change

New feature (non-breaking change which adds functionality)

Testing

# DOM mode
prime eval run browserbase/browser-dom-example -m openai/gpt-4.1-mini

# CUA mode (start server first: cd verifiers/envs/integrations/browser_env/cua-server && ./start.sh)
prime eval run browserbase/browser-cua-example -m qwen/qwen3-vl-30b-a3b-instruct

All existing tests pass when running uv run pytest locally.
New tests have been added to cover the changes.

Checklist

My code follows the style guidelines of this project as outlined in AGENTS.md
I have performed a self-review of my own code
I have commented my code, particularly in hard-to-understand areas
I have made corresponding changes to the documentation
My changes generate no new warnings
Any dependent changes have been merged and published

Additional Notes

Future work:

Compile CUA TypeScript server to binary to remove Node.js dependency
Additional benchmark environments are available on Prime Hub under the browserbase/ org

Note

Adds a new browser automation integration with two modes and supporting assets.

Introduces BrowserEnv (DOM via Stagehand, CUA via vision primitives) with default prompts, env var validation, tool handling, screenshot filtering, and mode-specific message formatting
Exposes BrowserEnv in verifiers/__init__.py and adds integration package under verifiers/envs/integrations/browser_env
Provides example environments: environments/browser_dom_example and environments/browser_cua_example (docs, datasets, loaders, pyprojects)
Bundles CUA server (verifiers/envs/integrations/browser_env/cua-server/) with Fastify/Stagehand code and scripts
Adds [browser] optional dependency group in pyproject.toml and updates integration docs (docs/environments.md, verifiers/envs/integrations/README.md, environments/AGENTS.md)
Adds comprehensive tests for modes, prompts, validation, DOM LLM config, screenshot filtering, example datasets, and updates tests/test_envs.py skip list

Written by Cursor Bugbot for commit e688da6. This will update automatically on new commits.

CLAassistant · 2026-01-15T11:46:34Z

All committers have signed the CLA.